Search for unexplored effects in speech production

Authors

  • Cecil H. Coker
  • M. H. Krane
  • B. Y. Reis
  • R. A. Kubli
Abstract

Speech coders invariably spend about 1/3 of their bits replicating a number of effects collectively termed "excitation". The need for so much bandwidth stems from two causes. The frame rate must be relatively high because transient changes must be resolved, and the amount of data needed for each frame is high because unpredictable broad-band components must be reproduced. Here we discuss three projects to learn more about these elusive aspects of speech. One project models the transient behavior. Two others seek to characterize stochastic processes that accompany periodic vibration in voiced sounds.

1. Sequential effects in excitation

Figure 1 (a) shows a waveform of the beginning of the word "heed". Fig. 1 (b) is the probable glottal behavior, based on data for a similar utterance, word-initial /h&/. The instrumentation is photoglottography (PGG), also known as transillumination. A fiberoptic light source is inserted through the nose and positioned above the glottis. In a darkened room, the outline of the glottis can be seen on the lower larynx and pharynx. A light sensor "watching" about 5 cm in this region gives a response that is reasonably linear with glottal area. Fig. 1 (c) is the Linear Prediction (LP) residual for the waveform in (a). The height of the sharp impulses reflects the abruptness of glottal closure.

In Fig. 1, the glottis is open for /h/, then closes for the vowel. Glottal vibration starts shortly before the glottal "rest area" gets roughly to zero. In the optical evidence, glottal area seems to reach a rather stable state perhaps 25 msec after vibration begins. However, the waveform and prediction residual show substantial change for another 60 msec. Something in the larynx is still changing. But what?

Here we offer a physiological explanation. A model is developed and its coefficients are tuned from available data. Although the model was suggested by a proposed explanation, it does not depend on details of that proposal.
The model is anchored only in the notion that measurable glottal area and the LP residual derive from a single variable and, conversely, that the behavior of that variable can be inferred from glottal instrumentation and LP data.

Fig. 2 is a sketch of the glottal mechanism. The L-shaped structures shaded gray are the arytenoid cartilages. The vocal cords attach to the tips of these cartilages, the vocal processes. The glottis is closed by tensing several muscles, principal among them the exterior thyro-arytenoids. As tension increases, the arytenoids rotate and the vocal cords move together. At some point, however, the opposing arytenoid tips press together and can move no further.

But there is a mechanism by which the thyro-arytenoids can still have an effect after arytenoid motion is blocked. The cords are pushed apart in the middle by air pressure and by stretch forces tending to restore the breathing state. The exterior thyro-arytenoids are curved and push inward through a draw-string action. Increasing tension, after the arytenoids block, pushes the cords in at the sides. This has a modest effect on glottal area, but a more profound effect is to alter the geometry of glottal closure.

Fig. 3 shows the progress of glottal closure for three adjustments. The lower left is a magnified view of glottal area vs. time for conditions a, b and c; the lower middle shows the glottal area derivative; and the lower right shows the spectral consequences. Condition a is a partially open glottis, as might occur in /z/ (exaggerated). The arytenoid tips are separated by a small distance, making the vocal-cord configuration somewhat triangular. During vibration, vocal-cord closure begins at the front and progresses toward the arytenoids in the back. This "zipper" action gives glottal area a time course closely resembling a decaying exponential. This contributes a 6 dB/octave spectral down-turn at a frequency FT that varies roughly inversely with measurable glottal area (see Fig. 5).
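The link between an exponential closure and a 6 dB/octave down-turn can be checked numerically. The sketch below is not the authors' model: it only assumes that the closing phase behaves like exp(-t / tau), whose Fourier transform is a first-order low-pass with corner frequency 1 / (2 * pi * tau), standing in for FT. The time constant `tau` and the function names are hypothetical, and the dependence of FT on glottal area is not modeled.

```python
import numpy as np

def exp_closure_spectrum_db(freqs_hz, tau_s):
    """Magnitude (dB re. DC) of the spectrum of a(t) = exp(-t / tau_s), t >= 0."""
    # Analytic Fourier transform, normalized by its DC value tau_s:
    # A(f) / A(0) = 1 / (1 + j * 2 * pi * f * tau_s)
    f = np.asarray(freqs_hz, dtype=float)
    response = 1.0 / (1.0 + 1j * 2.0 * np.pi * f * tau_s)
    return 20.0 * np.log10(np.abs(response))

def corner_frequency_hz(tau_s):
    """Frequency at which the first-order roll-off begins (the -3 dB point)."""
    return 1.0 / (2.0 * np.pi * tau_s)

# Hypothetical closure time constant of 0.5 ms (an assumption, not from the paper).
tau = 0.5e-3
ft = corner_frequency_hz(tau)

# Levels at the corner and at two frequencies one octave apart, well above it.
levels = exp_closure_spectrum_db([ft, 8.0 * ft, 16.0 * ft], tau)
octave_drop_db = levels[1] - levels[2]
```

Under these assumptions a slower "zipper" closure (larger `tau`) lowers the corner frequency, so the down-turn starts earlier in the spectrum; well above the corner, `octave_drop_db` approaches 6 dB.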
Figure 1. For the /hi/ in "heed": (a) waveform; (b) approximate glottal area (inferred from PGG for /h&/); (c) Linear Prediction residual.
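For readers unfamiliar with the Linear Prediction residual shown in Fig. 1 (c), the following is a minimal sketch of how such a residual is commonly computed with the autocorrelation method. It is not the authors' procedure: the predictor order, the synthetic two-pole "vowel", and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def lpc_coefficients(signal, order):
    """Autocorrelation-method predictor a[1..order]: x[n] ~ sum_k a[k] * x[n-k]."""
    x = np.asarray(signal, dtype=float)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:].
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    return np.linalg.solve(R, r[1:])

def lp_residual(signal, coeffs):
    """Inverse-filter the signal with A(z) = 1 - sum_k a[k] z^-k."""
    x = np.asarray(signal, dtype=float)
    inverse_filter = np.concatenate(([1.0], -np.asarray(coeffs, dtype=float)))
    return np.convolve(x, inverse_filter)[: len(x)]

# Synthetic test signal (hypothetical): a 100 Hz pulse train at 8 kHz
# driving a single two-pole resonance near 500 Hz.
fs = 8000
n_samples = 800
excitation = np.zeros(n_samples)
excitation[::80] = 1.0

r_pole, f_res = 0.95, 500.0
a1 = 2.0 * r_pole * np.cos(2.0 * np.pi * f_res / fs)
a2 = -(r_pole ** 2)
speech = np.zeros(n_samples)
for i in range(n_samples):
    speech[i] = excitation[i]
    if i >= 1:
        speech[i] += a1 * speech[i - 1]
    if i >= 2:
        speech[i] += a2 * speech[i - 2]

coeffs = lpc_coefficients(speech, order=2)
residual = lp_residual(speech, coeffs)
```

In this toy case the residual collapses back to sharp impulses at the excitation instants, which is why impulse height in an LP residual is read as an indication of how abrupt the excitation event was. Real speech would need a higher predictor order and windowed frames.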





Publication date: 1996